
    A note on structured pseudospectra

    In this note, we study the notion of structured pseudospectra. We prove that for Toeplitz, circulant, Hankel and symmetric structures, the structured pseudospectrum equals the unstructured pseudospectrum. We show that this is false for Hermitian and skew-Hermitian structures. We generalize the result to pseudospectra of matrix polynomials: the structured pseudospectrum equals the unstructured pseudospectrum for matrix polynomials with Toeplitz, circulant, Hankel and symmetric structures. We conclude by giving a formula for structured pseudospectra of real matrix polynomials. The particular type of perturbations used for these pseudospectra arises in control theory.
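
    For reference, the following standard definitions (not taken from the note itself; the norm and the way perturbations are measured there may differ) make the statement precise:

        % \Lambda(A+E) denotes the spectrum of A+E
        % Unstructured epsilon-pseudospectrum of a matrix A
        \Lambda_\varepsilon(A) = \{ z \in \mathbb{C} : z \in \Lambda(A+E)
            \ \text{for some } E \ \text{with } \|E\| \le \varepsilon \}
        % Structured version: perturbations restricted to a class S
        % (e.g. Toeplitz, circulant, Hankel or symmetric matrices)
        \Lambda_\varepsilon^{\mathcal{S}}(A) = \{ z \in \mathbb{C} : z \in \Lambda(A+E)
            \ \text{for some } E \in \mathcal{S} \ \text{with } \|E\| \le \varepsilon \}

    The equality results above then read \Lambda_\varepsilon^{\mathcal{S}}(A) = \Lambda_\varepsilon(A) for the listed structures.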

    Compensated Horner Scheme

    Using error-free transformations, we improve the classic Horner Scheme (HS) to evaluate (univariate) polynomials in floating-point arithmetic. We prove that this Compensated Horner Scheme (CHS) is as accurate as HS performed with twice the working precision. Theoretical analysis and experiments exhibit a reasonable running-time overhead that also compares favorably with double-double implementations. We introduce a dynamic and validated error bound for the value computed by CHS. The talk presents these results together with a survey of error-free transformations and the related hypotheses.
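
    A minimal sketch of the compensated scheme in Python, using the classic TwoSum and Dekker/Veltkamp TwoProd error-free transformations (function names, coefficient ordering and the absence of overflow handling are assumptions of this sketch, not the paper's interface):

        def two_sum(a, b):
            # Knuth's error-free transformation: a + b = s + e exactly.
            s = a + b
            bp = s - a
            e = (a - (s - bp)) + (b - bp)
            return s, e

        def split(a):
            # Veltkamp splitting for IEEE-754 binary64 (factor 2**27 + 1).
            c = 134217729.0 * a
            hi = c - (c - a)
            return hi, a - hi

        def two_prod(a, b):
            # Dekker's error-free transformation: a * b = p + e exactly.
            p = a * b
            ah, al = split(a)
            bh, bl = split(b)
            e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
            return p, e

        def compensated_horner(coeffs, x):
            # coeffs[0] is the leading coefficient; the compensation term c
            # accumulates the rounding errors of the plain Horner recurrence.
            s, c = coeffs[0], 0.0
            for a in coeffs[1:]:
                p, pi = two_prod(s, x)
                s, sigma = two_sum(p, a)
                c = c * x + (pi + sigma)
            return s + c

    On ill-conditioned evaluations, e.g. the expanded form of (x - 1)^5 near x = 1, the compensated result is markedly closer to the exact value than plain Horner evaluation.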

    On the maximum relative error when computing integer powers by iterated multiplications in floating-point arithmetic

    We improve the usual relative error bound for the computation of x^n through iterated multiplications by x in binary floating-point arithmetic. The obtained error bound is only slightly better than the usual one, but it is simpler. We also discuss the more general problem of computing the product of n terms.
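
    For context, the usual bound referred to here is the standard one obtained from the rounding-error model (u denotes the unit roundoff; the exact form of the improved bound is given in the paper):

        \widehat{x^n} = x^n (1+\theta), \qquad
        |\theta| \le (1+u)^{\,n-1} - 1 \le \gamma_{n-1}
        := \frac{(n-1)\,u}{1-(n-1)\,u},

    since the computation performs n - 1 rounded multiplications, each introducing a relative perturbation of at most u.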

    Algorithms for Accurate, Validated and Fast Polynomial Evaluation

    We survey a class of algorithms for evaluating polynomials with floating-point coefficients using IEEE-754 floating-point arithmetic. The principle is to apply, once or recursively, an error-free transformation of the Horner polynomial evaluation and to accurately sum the final decomposition. These compensated algorithms are as accurate as the Horner algorithm performed in K times the working precision, for K an arbitrary integer. We prove this accuracy property with an a priori error analysis. We also provide validated dynamic bounds and apply these results to compute a faithfully rounded evaluation. These compensated algorithms are fast. We illustrate their practical efficiency with numerical experiments on significant environments. Compared to existing alternatives, these K-times compensated algorithms are competitive for K up to 4, i.e., up to 212 mantissa bits.
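
    A sketch of the accurate-summation step in Python, in the spirit of the VecSum/SumK error-free vector transformations (illustrative only; two_sum is Knuth's error-free addition, repeated here so the block is self-contained):

        def two_sum(a, b):
            # Error-free transformation: a + b = s + e exactly.
            s = a + b
            bp = s - a
            return s, (a - (s - bp)) + (b - bp)

        def vec_sum(p):
            # One error-free pass: afterwards p[-1] approximates the sum
            # and p[:-1] hold the rounding errors of that pass.
            p = list(p)
            for i in range(1, len(p)):
                p[i], p[i - 1] = two_sum(p[i], p[i - 1])
            return p

        def sum_k(p, K):
            # Summation "as if" performed in K-fold working precision:
            # apply vec_sum K - 1 times, then add the residuals to the top term.
            for _ in range(K - 1):
                p = vec_sum(p)
            return sum(p[:-1]) + p[-1]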

    Reproducible Triangular Solvers for High-Performance Computing

    On modern parallel architectures, floating-point computations may become non-deterministic and, therefore, non-reproducible, mainly due to the non-associativity of floating-point operations. We propose an algorithm to solve dense triangular systems by leveraging the standard parallel triangular solver and our recently introduced multi-level exact summation approach. Finally, we present implementations of the proposed fast reproducible triangular solver and results on recent NVIDIA GPUs.
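
    For illustration only, the snippet below is plain (non-reproducible) forward substitution: the per-row dot product is the reduction whose evaluation order may change across parallel runs, and which a reproducible solver replaces by an order-independent exact summation.

        import numpy as np

        def forward_substitution(L, b):
            # Solve L x = b for lower-triangular L. The np.dot reduction is
            # where floating-point non-associativity makes parallel results
            # depend on the summation order.
            n = len(b)
            x = np.zeros(n)
            for i in range(n):
                x[i] = (b[i] - np.dot(L[i, :i], x[:i])) / L[i, i]
            return x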

    A Reproducible Accurate Summation Algorithm for High-Performance Computing

    Floating-point (FP) addition is non-associative, and parallel reductions involving this operation are a serious issue, as noted in the DARPA Exascale Report [1]. Such large summations typically appear within fundamental numerical blocks such as dot products or numerical integrations. Hence, the result may vary from one parallel machine to another or even from one run to another. These discrepancies worsen on heterogeneous architectures – such as clusters with GPUs or Intel Xeon Phi processors – which combine programming environments that may obey various floating-point models and offer different intermediate precisions or different operators [2,3]. Such non-determinism of floating-point calculations in parallel programs causes validation and debugging issues, and may lead to deadlocks [4].

    The increasing power of current computers enables one to solve more and more complex problems. That, consequently, leads to a higher number of floating-point operations to be performed, each of them potentially causing a round-off error. Because of round-off error propagation, some problems must be solved with a wider floating-point format.

    Two approaches exist to perform floating-point addition without incurring round-off errors. The first approach aims at computing the error that occurs during rounding using FP expansions, which are based on error-free transformations. FP expansions represent the result as an unevaluated sum of a fixed number of FP numbers, whose components are ordered in magnitude with minimal overlap to cover a wide range of exponents. FP expansions of sizes 2 and 4 are presented in [5] and [6], respectively. The main advantage of this solution is that the expansion can stay in registers during the computation. However, the accuracy is insufficient for the summation of numerous FP numbers or for sums with a huge dynamic range. Moreover, the complexity grows linearly with the size of the expansion.

    An alternative approach to expansions exploits the finite range of representable floating-point numbers by storing every bit in a very long vector of bits (accumulator). The length of the accumulator is chosen such that every bit of information of the input format can be represented; this covers the range from the minimum representable floating-point value to the maximum value, independently of the sign. For instance, Kulisch [7] proposed to use an accumulator of 4288 bits to handle the accumulation of products of 64-bit IEEE floating-point values. The Kulisch accumulator produces the exact result of a sum of a very large number of floating-point values of arbitrary magnitude. However, for a long period this approach was considered impractical, as it induces a very large memory overhead. Furthermore, without dedicated hardware support, its performance is limited by indirect memory accesses that make vectorization challenging.

    We aim at addressing both issues of accuracy and reproducibility in the context of summation. We advocate computing the correctly rounded result of the exact sum. Besides offering strict reproducibility through an unambiguous definition of the expected result, our approach guarantees that the result ha
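
    A toy illustration of the long-accumulator idea in Python, exploiting arbitrary-precision integers (this is not the paper's multi-level algorithm; the scaling by 2**1074 and the final rounding step are assumptions of the sketch):

        def exact_sum(values):
            # Every finite binary64 number is an integer multiple of 2**-1074,
            # so an arbitrary-precision integer can hold the exact sum.
            SCALE = 2 ** 1074
            acc = 0
            for x in values:
                num, den = x.as_integer_ratio()   # exact rational value of x
                acc += num * (SCALE // den)       # den is a power of two dividing SCALE
            # Python's int/int true division is correctly rounded, so this returns
            # the correctly rounded value of the exact sum (it raises OverflowError
            # if the exact sum exceeds the binary64 range).
            return acc / SCALE

    Because the accumulation is exact, the result is independent of the summation order, which is precisely the reproducibility property discussed above; the expansion-based alternative instead builds on two_sum-style error-free transformations.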

    Numerical validation in quadruple precision using stochastic arithmetic

    Discrete Stochastic Arithmetic (DSA) enables one to estimate rounding errors and to detect numerical instabilities in simulation programs. DSA is implemented in the CADNA library, which can analyze the numerical quality of single and double precision programs. In this article, we show how the CADNA library has been improved to enable the estimation of rounding errors in programs using quadruple precision floating-point variables, i.e. variables with a 113-bit mantissa. Although an implementation of DSA called SAM exists for arbitrary precision programs, a significant performance improvement has been obtained with CADNA compared to SAM for the numerical validation of programs with 113-bit mantissa variables. This new version of CADNA has been successfully used to control accuracy in quadruple precision applications, such as a chaotic sequence and the computation of multiple roots of polynomials. We also describe a new version of the PROMISE tool, based on CADNA, which aims at reducing the number of double precision variable declarations in numerical programs in favor of single precision ones, taking into account a requested accuracy of the results. The new version of PROMISE can now provide type declarations mixing single, double and quadruple precision.
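
    A toy illustration of the random-rounding principle behind DSA (this is not CADNA's implementation or API; CADNA randomizes the rounding of every operation and uses a Student-test based estimate, whereas the sketch below only perturbs final results and compares their spread):

        import math, random

        def random_perturb(x):
            # Stand-in for random rounding: move x one ulp up or down at random.
            return math.nextafter(x, math.inf if random.random() < 0.5 else -math.inf)

        def estimated_significant_digits(samples):
            # Rough estimate of the number of decimal digits common to several
            # randomly perturbed runs of the same computation.
            mean = sum(samples) / len(samples)
            spread = max(abs(s - mean) for s in samples)
            if spread == 0.0:
                return 15  # samples agree to (roughly) full binary64 precision
            if mean == 0.0:
                return 0   # the spread dominates the value: no significant digit
            return max(0, int(math.log10(abs(mean) / spread)))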

    Combining learning and optimization for transprecision computing

    The growing demands of the worldwide IT infrastructure stress the need for reduced power consumption, which is addressed in so-called transprecision computing by improving energy efficiency at the expense of precision. For example, reducing the number of bits for some floating-point operations leads to higher efficiency, but also to a non-linear decrease of the computation accuracy. Depending on the application, small errors can be tolerated, which allows the precision of the computation to be fine-tuned. Finding the optimal precision for all variables with respect to an error bound is a complex task, which is tackled in the literature via heuristics. In this paper, we report on a first attempt to address the problem by combining a Mathematical Programming (MP) model and a Machine Learning (ML) model, following the Empirical Model Learning methodology. The ML model learns the relation between the precision of the variables and the output error; this information is then embedded in the MP model, which minimizes the number of bits. An additional refinement phase is then added to improve the quality of the solution. The experimental results demonstrate an average speedup of 6.5% and a 3% increase in solution quality compared to the state of the art. In addition, experiments on a hardware platform capable of mixed-precision arithmetic (PULPissimo) show the benefits of the proposed approach, with energy savings of around 40% compared to fixed-precision arithmetic.
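
    A highly simplified sketch of the kind of precision-tuning loop described above, in Python (a greedy stand-in for the MP + ML combination; predict_error, measure_error and the per-variable bit counts are hypothetical placeholders, not the paper's interface):

        def tune_precision(variables, error_bound, predict_error, measure_error,
                           min_bits=8, max_bits=53):
            # Start from the lowest precision everywhere, then raise individual
            # variables until the learned error model and an actual run both
            # satisfy the error bound (playing the role of the refinement phase).
            bits = {v: min_bits for v in variables}
            while predict_error(bits) > error_bound or measure_error(bits) > error_bound:
                candidates = [v for v in variables if bits[v] < max_bits]
                if not candidates:
                    break  # cannot refine any further
                # Give one more mantissa bit to the variable whose increase
                # most reduces the predicted error.
                best = min(candidates,
                           key=lambda v: predict_error({**bits, v: bits[v] + 1}))
                bits[best] += 1
            return bits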